A Theoretical and Empirical Analysis of Reward Transformations in Multi-Objective Stochastic Games

Authors

Patrick Mannion

Jim Duggan

Enda Howley

Abstract

Reward shaping has been proposed as a means to address the credit assignment problem in Multi-Agent Systems (MAS). Two popular shaping methods are Potential-Based Reward Shaping and difference rewards, and both have been shown to improve learning speed and the quality of joint policies learned by agents in single-objective MAS. In this work we discuss the theoretical implications of applying these approaches to multi-objective MAS, and evaluate their efficacy using a new multi-objective benchmark domain where the true set of Pareto optimal system utilities is known.
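The two transformations named in the abstract have standard single-objective forms: potential-based shaping adds F(s, s') = γΦ(s') − Φ(s) to the reward, and a difference reward is D_i = G(z) − G(z_-i), where z_-i is the joint action with agent i's contribution replaced by a default. The sketch below applies each form independently to every component of a reward vector, which is one plausible multi-objective reading and not necessarily the authors' exact formulation. The names `pbrs_bonus`, `difference_reward`, and `toy_utility` are hypothetical, and the two-objective utility is invented purely for illustration.

```python
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


def pbrs_bonus(phi_s: Vector, phi_next: Vector, gamma: float) -> Vector:
    """Shaping term F = gamma * Phi(s') - Phi(s), computed per objective."""
    return tuple(gamma * pn - ps for ps, pn in zip(phi_s, phi_next))


def difference_reward(
    G: Callable[[Sequence[int]], Vector],
    joint_action: Sequence[int],
    agent: int,
    default_action: int,
) -> Vector:
    """D_i = G(z) - G(z_-i), computed per objective.

    z_-i is the joint action with the given agent's action replaced
    by default_action (an assumed counterfactual).
    """
    counterfactual = list(joint_action)
    counterfactual[agent] = default_action
    g = G(joint_action)
    g_cf = G(counterfactual)
    return tuple(a - b for a, b in zip(g, g_cf))


def toy_utility(actions: Sequence[int]) -> Vector:
    """Hypothetical two-objective system utility (illustration only)."""
    n_active = sum(actions)
    return (float(n_active), -float(n_active ** 2))


# Per-objective potentials for the current and next state (made-up numbers).
print(pbrs_bonus((1.0, 0.0), (2.0, 1.0), gamma=0.5))
# Contribution of agent 0 to each objective, with action 0 as the default.
print(difference_reward(toy_utility, [1, 1, 0], agent=0, default_action=0))
```

Treating each objective separately keeps the sketch simple; whether the guarantees known for the single-objective case carry over to the multi-objective setting is the question the paper itself addresses.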


Similar resources

Analysing the Effects of Reward Shaping in Multi-Objective Stochastic Games

The majority of Multi-Agent Reinforcement Learning (MARL) implementations aim to optimise systems with respect to a single objective, despite the fact that many real world problems are inherently multi-objective in nature. Research into multi-objective MARL is still in its infancy, and few studies to date have dealt with the issue of credit assignment. Reward shaping has been proposed as a mean...


Multi-item inventory model with probabilistic demand function under permissible delay in payment and fuzzy-stochastic budget constraint: A signomial geometric programming method

This study proposes a new multi-item inventory model with hybrid cost parameters under a fuzzy-stochastic constraint and permissible delay in payment. The stochastic demand depends on price and marketing expenditure, and the unit production cost depends on demand. Shortages are allowed and partially backordered. The main objective of this paper is to determine selling price, mar...


Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations

This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be not optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert strategies, we introduce a new objective function that directly pits experts against Nash Equilibrium strategies, and we design an algorithm to solve fo...


Policy Invariance under Reward Transformations for General-Sum Stochastic Games

We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remain unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding convergence when learning to play a stochastic game.


Designing a new multi-objective fuzzy stochastic DEA model in a dynamic environment to estimate efficiency of decision making units (Case Study: An Iranian Petroleum Company)

This paper presents a new multi-objective fuzzy stochastic data envelopment analysis model (MOFS-DEA) under mean chance constraints and common weights to estimate the efficiency of decision making units for their future financial periods. In the initial MOFS-DEA model, the outputs and inputs are characterized by random triangular fuzzy variables with normal distribution, in which ...




Publication date: 2017
